ABSTRACT

The previous study in this series showed that evaluation of 
R&D activities rests eventually on qualitative judgments. 

The purpose of this study was to develop, validate, and make a trial
application of a procedure for obtaining qualitative judgments
economically and efficiently. The Ford procedure for scaling
partially ordered sets of rankings was programmed and validated
using an abstract judgmental task with an extrinsic criterion. 
It was given a trial application requiring the ordering on 
merit of current personnel research projects. Both validation 
and trial application results were highly satisfactory. It 
was concluded that the Ford procedure could be used to obtain 
scaled qualitative judgments in a wide variety of settings 
with accuracy, efficiency, and economy. Flow charts, data 
setup, and the complete computer program are given. 

This research was supported in part by the Personnel Research 
Division, Bureau of Naval Personnel, through Project Order No. 
1-0001, Naval Personnel Research and Development Laboratory.

PREFACE



This report is the second in a research project between the 
sponsoring activity, the Personnel Research Division of the Bureau 
of Naval Personnel, and the Naval Postgraduate School. The study was 
performed under the auspices of Capt. G. F. Britner, Division Direc- 
tor, and Mr. A. A. Sjoholm, Technical Director, Personnel Research 
Division. 

We would like to express our thanks to Dr. Frank M. Andrews, 
Survey Research Center, Institute for Social Research, University of 
Michigan, for providing a copy of the Michigan Ford Program on which 
much of this work was based. 

Portions of this work were done for a master’s thesis in operations
research by the junior author under the direction of the senior
author.

Various aspects of this work were presented at the Research 
and Development Working Group, 28th Military Operations Research Sym- 
posium, Ft. Lee, Va., in November 1971, and at the XIXth International 
Meeting of The Institute for Management Sciences, Houston, Texas, in 
April 1972. The distribution list reflects the requests for this 
paper as a result of these presentations. It is hoped that recipients 
of this report will find it useful in the many different contexts of 
research indicated by their addresses and positions.

Evaluation and Innovation in the Navy’s Personnel Research Laboratories,
II. Development, Validation, and Trial Application of a Computer
Program to Facilitate Judgmental Appraisal, by James K. Arima and
Richard W. Mister.



BRIEF 

The previous study in this series showed that there are no generally
applicable, hard measures of the effectiveness of an R&D laboratory’s
activities and operations. The basis for determining effectiveness
of a laboratory eventually narrows down to the judgments of persons
who, for various reasons, are deemed qualified to make such judgments.



This being the case, it follows that the evaluation process can 
be improved by developing practical methods for obtaining and process- 
ing judgments that are simple to apply, broadly applicable, and faith- 
fully reflect the contribution of each judge. Ideally, the results 
should be expressed quantitatively to permit their use in conjunction 
with other statistical and mathematical tools. 

To have these characteristics, a method should permit an individual
judge, faced with a set of alternatives to "prioritize," to rate
only those with which he is familiar, to set his own measurement scale, 
and to make use of ties when he sees no difference between alternatives. 
The Ford procedure permits a judge to behave in this manner. It was 
originally programmed for computer application by the Survey Research 
Center, University of Michigan. The program was obtained and adapted 
for use on the computing facilities of the Naval Postgraduate School 
(NPS) which uses an IBM 360/67 system. The program, along with explan- 
atory instructions, is reproduced in this report. 

To prove the Ford program was broadly applicable and effective, 
a validation test was conducted using an abstract, vague, rating task 
for which there was, unknown to the judges, an independent set of
quantitative "truth" data for comparison. Next, a trial application of the
program was made in which Navy officers rated current personnel research 
projects as to the advisability of retaining and pursuing them in the 
R&D program. Finally, the Ford procedure was used in a real-life situa- 
tion to analyze student ratings of courses in the NPS operations research 
program. The Ford rating procedure and NPS computer program were highly 
satisfactory in all of these test applications. 

It was concluded that a simple, effective, and broadly useful 
procedure for obtaining and scaling the evaluative opinions of judges 
had been developed, tested, and applied. The suggestion was made to 
use the procedures to analyze project selection in the Navy’s personnel 
research laboratories, since it is widely recognized that, for a labora- 
tory to be effective, it must be working on the right programs at the 
right time.

I. PURPOSE AND SCOPE



The previous study in this series (Arima, 1971) discussed var- 
ious factors associated with the effectiveness of Federal in-house 
laboratories. The problem of evaluating the effectiveness of a spe- 
cific laboratory, such as the Navy’s personnel research laboratories, 
was of special interest. Approaches to this evaluation problem seemed 
ultimately to require a qualitative assessment of a laboratory’s ef- 
fectiveness or some aspect of its operations by knowledgeable indi- 
viduals. Accordingly, one specific problem identified as a result of 
the preliminary study was to develop and test a method for obtaining 
and analyzing such assessments from qualified judges in an economic, 
convenient, and effective manner. This report addresses itself to 
this problem. 

The approach taken to solve the problem, explicated in the
pages that follow, was: (1) adapt Ford’s (1957) procedure for creating
numerical rankings from a set of incomplete comparisons of objects by
a group of judges, as programmed by Pelz and Andrews (1966), to
operate on the Naval Postgraduate School’s IBM 360/67 system, (2)
validate the procedures using an arbitrary task with an extrinsic
criterion measure, and (3) test the feasibility of using the procedures
to obtain an ordered set of qualitative judgments on an R&D problem
appropriate to the environment and mission of the Navy’s personnel
research laboratories.

II. THE FORD PROCEDURE
A. FLEXIBILITY OF PROCEDURES

There are three characteristics of Ford’s procedure that make
it especially appropriate for obtaining judgments on several alterna- 
tives or items from a diverse group of judges. First, a judge or 
rater adjudicates only those items that he feels competent to judge. 
Second, he can make his judgments as coarse or as fine as he desires 
because there is no restriction on how many judgmental categories he 
must use. And third, there is no requirement for a fixed distribution 
of items among the categories, except that, collectively over judges, 
no more than one third of all items being rated should be in any one 
category. A judge, for example, might decide to judge only half of a 
pool of items using three categories — high, medium and low. 

The ease of this method can be compared with other frequently
used methods that may require one or more of the following
restrictions: all items must be ranked with no ties, each item is to be
compared with every other item with no indeterminate category permitted,
an equal number of items must be placed in each rating category, and
so forth. Such restrictions are usually imposed because of statistical
considerations in the analytical procedures. Unfortunately,
persons who are unfamiliar with the statistical considerations are
alienated by the results of the procedures because, while serving
as judges, they had to make too many arbitrary decisions in which
they felt no confidence. A more serious consequence of such procedures
is that a large amount of noise might be added to the
judgments so that the "signal" present in the judgments cannot be
discriminated. Moreover, some of the techniques, such as paired
comparisons, are excessively demanding of a judge’s time. Thus, the
statistical rigor is offset by serious negative consequences of the
procedures involved.

At this time, it should be noted that the procedures being de- 
veloped here are not the same as those designed to achieve a consensus 
or decision among a group of judges, such as some applications of the 
Delphi technique. These procedures tend to be used when the number of 
alternatives and judges are few, when any of the alternatives are rea- 
sonable choices, and when the problem is one of reaching consensus 
rather than evaluating the relative merit of the alternatives. The 
procedures tend to disregard the contribution of the individual and 
depend on devious group processes and feedback to eliminate, eventually, 
any individuality not consonant with the prevailing group trend. It 
should be pointed out that there is no way to determine to what extent 
the final decision is based on the relative merits of the items enter- 
ing into the decision and on the group processes employed in arriving 
at a consensus. The procedures being developed here, on the other hand, 
produce a composite judgment that reflects the contribution of each 
judge according to the proportionate number of judgments he makes. The 
results of the procedure do not, however, produce a clear-cut decision 
or unanimity of opinion. Other factors and other methods must be em- 
ployed for the decision-making process using the composite judgments 
as a data base. Bartee (1971), for example, suggests a linear programming
approach with zero-one variables. In many cases, however, the
scaled alternatives might be an end in themselves with actions taking 
on priorities according to their scaled values. 



B. DETAILS OF THE FORD PROCEDURE 

The Ford procedure is based on forming a win-loss matrix,
$A = (a_{ij})$, where $a_{ij}$ represents the number of times object i
is preferred over object j by the judges, and $a_{ii} = 0$. Moreover,
ties and nonjudged items do not enter the matrix for any one judge
since a win-loss determination has not been made. Thus, each judge
contributes to the composite judgment only those pairwise instances
in which he has preferred one alternative over another. The Ford
procedure then determines a weight, $w_i$, for each item. These weights
are interpreted as odds in the sense that the probability of item i
being preferred to item j in any comparison is taken to be
$w_i/(w_i + w_j)$. These probabilities could then be used to compute
matrix A. The set of weights sought is the maximum likelihood
solution, that is, the set that maximizes the likelihood of obtaining
the original matrix, A. The weights are obtained by solving iteratively
the equation

$$
w_i^{n+1} = \frac{\sum_j a_{ij}}{\sum_j \dfrac{a_{ij} + a_{ji}}{w_i^{n} + w_j^{n}}}
\qquad (1)
$$

where $a_{ij}$ = number of times object i was preferred to object j;
$a_{ji}$ = number of times object j was preferred to object i;
$w_i^{n}$ = weight assigned to object i on the nth iteration; and
$w_j^{n}$ = weight assigned to object j on the nth iteration. The
weights are win percentages on the first iteration. The iteration
stops in the computer program when a predetermined convergence
criterion is reached or
a predetermined number of iterations has been completed. 
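
The iteration just described can be sketched in a few lines of Python. This is a minimal illustration only, not the program reproduced in Appendix III: the function name `ford_weights`, its parameter names, and the small three-item matrix are hypothetical, and the sketch assumes that every item has at least one win and one loss so that no division by zero occurs.

```python
def ford_weights(a, tol=0.005, max_iter=50):
    """Iterate equation (1) on a square win-loss matrix `a`.

    a[i][j] is the number of times item i was preferred to item j.
    Returns (weights, iterations_run, converged).
    Assumption: every item has at least one win and one loss.
    """
    n = len(a)
    wins = [sum(a[i]) for i in range(n)]
    losses = [sum(a[j][i] for j in range(n)) for i in range(n)]
    # First-iteration weights are the win percentages.
    w = [wins[i] / (wins[i] + losses[i]) for i in range(n)]
    for iteration in range(1, max_iter + 1):
        new_w = []
        for i in range(n):
            # Denominator of equation (1): comparisons between i and j,
            # each divided by the sum of the two current weights.
            denom = sum((a[i][j] + a[j][i]) / (w[i] + w[j])
                        for j in range(n) if j != i)
            new_w.append(wins[i] / denom)
        # Largest change in any weight between successive iterations.
        change = max(abs(x - y) for x, y in zip(new_w, w))
        w = new_w
        if change < tol:
            return w, iteration, True
    return w, max_iter, False


# Hypothetical three-item win-loss matrix, for illustration only.
a = [[0, 3, 4],
     [1, 0, 3],
     [1, 2, 0]]
weights, iterations, converged = ford_weights(a)
print(weights, iterations, converged)
```

The default values of `tol` and `max_iter` follow the convergence criterion and iteration limit quoted elsewhere in this report; the stopping test on the largest single change in weight is an assumption about how the criterion is applied.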

There was one assumption in Ford’s procedure that made it difficult
to apply in practice. This was a partition assumption which
stated that in any partition of the win-loss matrix into two nonempty
subsets, some item in each subset had to be preferred at least once to
some item in the other subset. That is, the initial $w_i$ and $w_j$
could not be 1 and 0 in equation (1). This rule would be broken in the
case of universally high and universally low alternatives and in any
subset where all judgments are in one direction. Pelz and Andrews
(1966) solved this problem by first removing universally high and low
items from the win-loss matrix before computing the weights and by
adding a very small constant, .00001, to each of the remaining entries
in the matrix. These procedures permitted them to program Ford’s
procedures for computer processing of judgments involving 130 judges and
130 items. Accordingly, the Pelz and Andrews program was used as a 
starting point for adapting Ford’s procedure to the Naval Postgraduate 
School’s IBM 360/67 system. The program as adapted for the IBM 360/67 
system will hereafter be referred to as the Ford program. 
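
The Pelz and Andrews adjustment described above might be sketched as follows. The sketch is an illustration under stated assumptions, not their code: the function name `adjust_matrix` is hypothetical, a "universally high" item is taken to be one with wins and no losses, a "universally low" item one with losses and no wins, the removal is done in a single pass, and the constant is added to off-diagonal cells only.

```python
def adjust_matrix(a, eps=0.00001):
    """Drop universal highs and lows, then add a small constant.

    a[i][j] is the number of times item i was preferred to item j.
    Returns (reduced_matrix, kept_indices, highs, lows).
    """
    n = len(a)
    wins = [sum(a[i]) for i in range(n)]
    losses = [sum(a[j][i] for j in range(n)) for i in range(n)]
    # Assumed definitions: high = never lost, low = never won.
    highs = [i for i in range(n) if losses[i] == 0 and wins[i] > 0]
    lows = [i for i in range(n) if wins[i] == 0 and losses[i] > 0]
    keep = [i for i in range(n) if i not in highs and i not in lows]
    # Add the constant to every remaining off-diagonal cell so that no
    # pair of remaining items has all judgments in one direction.
    reduced = [[0.0 if i == j else a[i][j] + eps for j in keep]
               for i in keep]
    return reduced, keep, highs, lows


# Hypothetical four-item matrix: item 0 never loses, item 3 never wins.
a = [[0, 2, 3, 1],
     [0, 0, 2, 1],
     [0, 1, 0, 1],
     [0, 0, 0, 0]]
reduced, keep, highs, lows = adjust_matrix(a)
print(keep, highs, lows)
print(reduced)
```

The removed items can be placed at the top and bottom of the final ordering, since every judge who compared them ranked them the same way.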

C. THE FORD PROGRAM 

A flow-chart of the program is included at Appendix I. The data 
assembly for input to the program is shown in Appendix II. The program, 
itself, with explanatory comments is reproduced at Appendix III.

Two decisions are required by the person using the program.
First, he must specify the convergence criterion for the iterative
determination of the weights. This report uses .005. That is, when the
weights change by less than that amount between successive
iterations, a satisfactory stabilization of the weights is accepted.
Second, the user must specify how many iterations are to be conducted
in the event the convergence criterion is not reached. This report
uses 50. As will
be shown, the rank ordering of the items, as determined from their 
weights, stabilizes rapidly. Accordingly, even if the convergence cri- 
terion is not met, the rank ordering is acceptable. When the conver- 
gence criterion is met, the weights can be used as an interval scaling 
of the judged items. 

The program operates in three subroutines or cores. The first
core assigns an ID number (hereafter called "assigned ID number") to
each rated alternative as it is read into the computer and then
computes how many comparisons are to be made between pairs of
alternatives, excluding ties.

The second core forms the win-loss matrix, eliminates universal 
highs and lows, assigns the small constant to each cell, and then com- 
putes the initial weights. 
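
The formation of the win-loss matrix and the initial weights can be sketched as follows. This is a hedged illustration rather than the Appendix III code: the function names, the representation of each judge's ratings as a dictionary from item to rank (rank 1 being best), and the sample data are all hypothetical.

```python
def win_loss_matrix(judgments, items):
    """Build the win-loss matrix from each judge's partial ranking.

    `judgments` is a list of dicts, one per judge, mapping an item to
    the rank that judge gave it (lower rank = preferred). Items a judge
    did not rate are simply absent; tied items contribute nothing.
    """
    index = {item: k for k, item in enumerate(items)}
    n = len(items)
    a = [[0] * n for _ in range(n)]
    for ranks in judgments:
        for x in ranks:
            for y in ranks:
                if ranks[x] < ranks[y]:  # x preferred to y by this judge
                    a[index[x]][index[y]] += 1
    return a


def win_percents(a):
    """Initial weights: wins divided by total decided comparisons."""
    n = len(a)
    result = []
    for i in range(n):
        wins = sum(a[i])
        losses = sum(a[j][i] for j in range(n))
        total = wins + losses
        result.append(wins / total if total else 0.0)
    return result


# Hypothetical ratings from three judges.
items = ["cotton", "wool", "lace"]
judgments = [
    {"cotton": 1, "wool": 2, "lace": 2},  # wool and lace tied
    {"cotton": 1, "lace": 2},             # wool not rated
    {"wool": 1, "cotton": 2},             # lace not rated
]
a = win_loss_matrix(judgments, items)
print(a)
print(win_percents(a))
```

Because ties and unrated items add nothing to the matrix, a judge who uses few categories or rates few items contributes proportionately fewer comparisons to the composite judgment.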

The third core performs the iterations until the weights stab- 
ilize or until the specified number of iterations have been run. The 
results are printed out showing a list of judges and the number of 
comparisons made. The output gives a mapping of the assigned ID numbers 
to the original numbers used for input of the variables. The win-loss
matrix is shown by assigned ID number. Finally, there is a printout
of the weights by iterations and a list of final weights shown by as- 
signed ID number and giving the corresponding original ID number. 

III. VALIDATION OF THE FORD PROGRAM
A. THE VALIDATION PROBLEM 

Pelz and Andrews (1966) showed some comparisons of the Ford
procedure with alternative methods for scaling partially ordered
judgments. Having shown the computational advantages of the Ford
procedure, they then demonstrated its utility in their evaluation of
scientists in organizations. They did this by having laboratory
directors rate their scientists as to their excellence in scientific
research using the Ford procedure. These ratings were then scaled and
used as the criterion variable in their studies. It should be noted,
however, that the validity of these ratings was not established in a
psychometric sense (American Psychological Association, 1954), other
than that of face validity. That is, they were not subjected to a
critical comparison against some outside criterion.

Among the other forms of validity (concurrent, predictive, and
construct), concurrent validity of the scaled judgments would be of
most interest when the judgments are to be used as a criterion measure,
dependent variable, objective function, or, in general, a measure of
effectiveness. That is, we would like to know how well the judgments
represent the true state of the world that they are presumed to
represent. This is particularly true when, as in the case of the Ford
procedure, judgments which are ordinal in nature are mapped to the
system of real numbers and used as a cardinal measure. For the
application made by Pelz and Andrews, we would like to know how
accurately the scaled ratings represent the true effectiveness of the
rated scientists. Stated in this form, the difficulty or impossibility
of assessing the concurrent validity of the scaled ratings becomes
readily apparent: judgments of this type are used because there is no
other acceptable measure of the variable in which interest lies.

In view of the foregoing, it follows that an existing, scaled 
variable is needed to validate the Ford program. In its simplest form, 
validation might take on the paradigm of a psychophysical experiment.
For example, a set of standard weights might be presented to judges 
with the task of rating the relative heaviness of the weights. There 
would be little interest in such a test of the Ford procedure, since 
it would be a straightforward evaluation of a numerical estimation
function as the size of the weights varies. In a validation of the Ford
procedure, interest lies in the nature of the underlying quality of pairs
of objects as they are judged and what the relationship is of the per- 
ceived quality to the decisions of the judges. This distinction in 
emphasis is elaborated in detail by Krantz (1972). The test in a 
psychophysical paradigm might be more relevant, for example, if the 
judges had to rate the weights of objects differing considerably in 
size and mass. Thus, an ideal validation of the Ford procedure would
take place if judges were to rate items according to an abstract or
vague variable for which there is, unknown to them, a corresponding 
quantitative, objective variable that could serve as a criterion mea- 
sure. Unfortunately, the more vague or abstract a judging task becomes, 
the more difficult it is to find a criterion variable that is also not 
equally vague. Accordingly, validation of the Ford procedure with a 
challenging task will be less than rigorous and any discrepancy of the 
resulting scaled judgments from the criterion values may be due to 
several factors which will not be independently assessable. These in- 
clude the difficulty of the judgmental task, the capability of the 
judges, the reliability of the criterion variable, and the efficiency 
of the Ford program. The validation, then, will be clinical, and hope- 
fully diagnostic, while attempting to be rigorous. 

B. METHOD

1. Stimulus Materials

Fortunately, there is a situation that compares favorably with 
the ideal validation paradigm mentioned above. It has been found that 
such abstract characteristics or qualities of words as their familiarity, 
meaningfulness, and associational richness are closely related to the 
frequency with which they appear in the English language (Broadbent,
1967; Ekstrand, Wallace, & Underwood, 1966; Underwood, 1966).
Fortunately, too, the frequency of 30,000 words has been cataloged in
what has become known as the Thorndike and Lorge (1944) word count.
Now, it can be assumed that most individuals are not consciously aware
of the fact that familiarity, say, of English words depends on their
frequency.
Accordingly, it should be possible to ask judges to rate a list of se- 
lected words from the Thorndike and Lorge word count for their famil- 
iarity to persons in general and compare the Ford-scaled ratings with 
the Thorndike and Lorge word count, thus completing the validation. 

Rather than selecting words directly from the Thorndike and 
Lorge word count, an intermediate procedure was inserted to provide 
some structure to the judging task and to make possible four replica- 
tions of the judging procedure. The words were actually taken from 
the category norms for verbal items compiled by Battig and Montague 
(1969). Their norms are based on the primacy and frequency with which 
students at two large universities provided verbal associations for 56 
different verbal categories, such as a precious stone, a unit of time, 
and so forth. Four of these categories were chosen from which to se- 
lect words based on the fact that there was a correlation of .90 or 
greater between the two universities and that there was a long enough 
list of associations from which selections could be made, graded for 
their frequency in the Thorndike and Lorge count. The categories 
selected, which will hereafter be referred to only by the Roman numeral 
designation given below, were: 

I. A kind of cloth (r = .988) 

II. A kitchen utensil (r = .987) 

III. A substance for flavoring food (r = .977) 

IV. A disease (r = .906)

The correlations shown are those between the two university groups, and
are based on the rank position occupied by the words within a category 
based on their frequency of mention. 

The selection of specific words from the categories was made by 
reference to the Thorndike and Lorge word count using the following 
guidelines, which could be applied only approximately. Twelve words 
were chosen from each category to make a test list. The 12 words were 
further divided into approximately four groups with at least a 5 to 10 
percent difference in frequency of occurrence between each group, based 
on the Thorndike and Lorge general (G) count. Between items in each 
group, there was a 1 to 3 percent difference in the frequency of occur- 
rence. Where there were ties in the general count, the other counts 
(T, L, and S) given in the word count were used to break the ties. Thus, 
there was a fairly reliable clustering of words into four frequency 
ranges within each list and a less reliable ranking within the frequency 
ranges. The lists are shown in Table 1. Each category provided an
independent replication for validation.

2. Subjects 

Twenty male and female Naval Postgraduate School students ranging 
in age from 24 to 37 years with comparable levels of education served 
in the validation experiment. Each subject was used twice, and 10 sub- 
jects were assigned at random to each of the four categories. 



3. Procedure

Each word list was reproduced in random order on a sheet of 
paper. The subjects were told to make an ordinal ranking of the words 
as to what they believed their relative familiarity was to all people 
in general. They were further instructed to judge only those objects 
which they could rank with confidence, make use of as many ordinal 
ranks as they deemed necessary, and to place as many objects in each 
rank as they desired. By way of guidance, they were instructed to 
select the number of ordinal ranks they were willing to use first and 
then to write the number of the rank beside the objects they chose to 
rank. They were also advised to give first impressions and work
rapidly.

C. RESULTS 

The orderings made by the subjects and processed by the Ford 
program are shown in Table 2, along with the Spearman rank correlation 
(rho) between the Thorndike-Lorge and Ford program orderings. The
results will be examined in detail only for category I.

Table 3 shows the win-loss matrix for category I. The rows (i) 
are arranged in the sequence, from top to bottom, according to their 
assigned ID numbers. When one reads across the table horizontally, he 
is reading the number of times the row item was preferred to any column 
item and the sum in the rightmost column shows how many times the row 
item "won." When one reads down the columns vertically, he is reading
the number of times the column item lost to the row item, and the sum

TABLE 1

VALIDATION TEST LISTS WITH WORDS PRESENTED IN
THORNDIKE-LORGE RANK ORDER WITHIN CATEGORIES

      CATEGORY I               CATEGORY II

   1. cotton                1. cup
   2. felt                  2. bowl
   3. wool                  3. knife
   4. lace                  4. fork
   5. velvet                5. refrigerator
   6. canvas                6. saucer
   7. muslin                7. sieve
   8. piqué                 8. skillet
   9. rayon                 9. ladle
  10. corduroy             10. scraper
  11. denim                11. toaster
  12. batiste              12. cleaver

      CATEGORY III             CATEGORY IV

   1. salt                  1. cold
   2. sugar                 2. rheumatism
   3. sage                  3. typhoid
   4. ginger                4. cancer
   5. vinegar               5. smallpox
   6. cloves                6. cholera
   7. mustard               7. measles
   8. cinnamon              8. rheumatic fever
   9. nutmeg                9. syphilis
  10. thyme                10. diabetes
  11. basil                11. dysentery
  12. cayenne              12. peritonitis

TABLE 2

FORD PROGRAM RANK ORDERING OF WORDS WITHIN CATEGORIES

(Words are listed in Ford program rank order. Criterion rank
numbers and the Spearman rank order correlation between the
computer and criterion rank orders are shown.)

  CATEGORY I (rho = .521)      CATEGORY II (rho = .598)

   1. cotton                4. fork
   3. wool                  6. saucer
   4. lace                  3. knife
   6. canvas                1. cup
  10. corduroy              2. bowl
   5. velvet                8. skillet
  11. denim                11. toaster
   9. rayon                 5. refrigerator
   2. felt                 12. cleaver
   8. piqué                 9. ladle
   7. muslin               10. scraper
  12. batiste               7. sieve

  CATEGORY III (rho = .687)    CATEGORY IV (rho = .460)

   1. salt                  1. cold
   2. sugar                 4. cancer
   7. mustard               7. measles
   5. vinegar               9. syphilis
   6. cloves               10. diabetes
   8. cinnamon              2. rheumatism
   9. nutmeg                5. smallpox
   4. ginger                6. cholera
   3. sage                  3. typhoid
  11. basil                11. dysentery
  12. cayenne               8. rheumatic fever
  10. thyme                12. peritonitis



TABLE 3

WIN-LOSS MATRIX FOR CATEGORY I

[Matrix entries not recoverable from the source. Rows and columns are
the 12 category I items in assigned ID order, with row sums (wins),
column sums (losses), and win percents.]



at the bottom of the columns shows the frequency of losses. Within the
matrix, any entry shows how many times a comparison was made between
the two items involved. For example, the maximum number of 10
comparisons was only made between cotton and denim and cotton and
muslin. In the matrix notation, these would be $a_{4,7}$ and
$a_{4,8}$. The win percents which would be used as the initial weights
in equation (1) are shown below the column sums. A comparison of the
rankings which would be made on the basis of the Thorndike-Lorge
count, the Ford program scaling, and the win percent is shown in
Table 4. A graph showing how the weights change per iteration is
presented in Figure 1.

The observed rank correlation of .521 between the Thorndike- 
Lorge and category I rankings is not as high as one would like. An 
examination of the rankings showed a great discrepancy for the word
"felt." Two good reasons can be given for this discrepancy with the
benefit of retrospect. First, it was found that "felt" in the
Thorndike-Lorge count includes the past tense of "feel," which would
account for its high position in the word count. The cloth, felt, is
included also. The subjects were, of course, ranking the latter use of
the word. Second, the Thorndike-Lorge count was published in 1944 and
the use of
felt has diminished greatly since then so that it is not as familiar 
to a new generation of persons. Recomputation of the correlation for 
category I with felt removed resulted in a rho of .788. 

TABLE 4

COMPARATIVE RANK ORDERING OF CATEGORY I ITEMS

Thorndike-Lorge        Ford Program    Win Percent

 1. Cotton                   1              1
 2. Felt                     9              9
 3. Wool                     2              2
 4. Lace                     3              3
 5. Velvet                   6              6
 6. Canvas                   4              4
 7. Muslin                  11             10
 8. Pique                   10             11
 9. Rayon                    8              7
10. Corduroy                 5              5
11. Denim                    7              8
12. Batiste                 12             12

Similarly, a rho of .460 was disappointing for category IV (Table
2). Inspection of the differences in rankings showed typhoid and
syphilis occupying diametrically opposite positions in the two rankings
(Table 1). The differences could again be accounted for by changing
trends in the incidence of the diseases and the openness with which
syphilis is mentioned today compared with 1944. Moreover, the military
personnel who served as subjects would be more sensitive to syphilis
as a disease than the population at large, owing to the emphasis given
venereal disease prevention in the military services. With the
differences in the observed ranks halved for the two diseases, rho for
category IV was increased to .585. With these two changes, each of the
four obtained correlation coefficients was found to be significantly
different from a hypothesized rho of zero by a two-tailed t test at
the .05 level.
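
The significance test mentioned above can be illustrated with the usual t approximation for a rank correlation, t = rho * sqrt((n - 2) / (1 - rho^2)) with n - 2 degrees of freedom. This is a sketch, not the authors' computation: the function names are hypothetical, n = 12 for category I is taken from Table 4, n = 11 after removing felt is an assumption, and the critical values are the standard two-tailed .05 entries of a t table.

```python
import math

# Two-tailed .05 critical values of Student's t, keyed by degrees of freedom.
CRITICAL_T_05 = {9: 2.262, 10: 2.228}


def t_for_rho(rho, n):
    """t statistic for testing a rank correlation against rho = 0."""
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))


def is_significant(rho, n):
    """True when |t| exceeds the tabled two-tailed .05 critical value."""
    return abs(t_for_rho(rho, n)) > CRITICAL_T_05[n - 2]


# Category I as first computed (12 items) and with felt removed
# (11 items is an assumption; the report does not state n).
print(round(t_for_rho(0.521, 12), 2), is_significant(0.521, 12))
print(round(t_for_rho(0.788, 11), 2), is_significant(0.788, 11))
```

Under these assumptions the uncorrected category I coefficient falls short of the critical value while the corrected one exceeds it, which is consistent with the statement that significance was reached "with these two changes."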

Table 4 also suggests that the win percent calculated from the 
win-loss matrix is closely related to the final ordinal rankings of 
the items. In consonance with this observation, it was found that 
rank order stability was reached after the first iteration for cate- 
gories I, III, and IV and after the third iteration for category II. 
Category I converged in 35 iterations and category III, in 16. No 
convergence was reached for categories II and IV after 50 iterations. 
Four objects in category III were rated as universal highs and were 
removed prior to computation of weights. 
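
The iteration and stopping rule discussed above can be sketched as follows. This is a minimal illustration that assumes the commonly cited form of Ford's maximum-likelihood update (new weight = total wins divided by the sum, over opponents, of the number of comparisons over the sum of the two current weights). Equation (1) of the report is not reproduced in this section, so the update itself, the rescaling of the weights to sum to the number of items, and the names `ford_weights`, `eps`, and `max_iter` are assumptions rather than a transcription of the program.

```python
def ford_weights(a, start, eps=0.005, max_iter=50):
    """Iterate weights until no weight changes by eps or more.

    a[i][j] = times item i was preferred to item j.
    start   = positive starting weights (e.g., win proportions).
    Returns (weights, iterations), with iterations None when the
    limit is reached without convergence.
    """
    n = len(a)
    w = list(start)
    for iteration in range(1, max_iter + 1):
        new = []
        for i in range(n):
            wins = sum(a[i])
            denom = sum(
                (a[i][j] + a[j][i]) / (w[i] + w[j])
                for j in range(n)
                if j != i
            )
            new.append(wins / denom)
        # Rescaling is an assumption; the program's scaling may differ.
        scale = n / sum(new)
        new = [x * scale for x in new]
        if max(abs(x - y) for x, y in zip(new, w)) < eps:
            return new, iteration
        w = new
    return w, None


# Hypothetical three-item matrix, not data from the report.
toy_matrix = [[0, 3, 4], [1, 0, 2], [0, 2, 0]]
weights, iterations = ford_weights(toy_matrix, [0.875, 0.375, 0.25], max_iter=500)
print(weights, iterations)
```

An item with no wins or no losses has no finite positive solution under this update, which is why universal highs and lows are removed before the weights are computed.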

D. DISCUSSION AND SUMMARY 

To recapitulate, the validation procedure used 20 individuals
who were assigned in groups of 10 to four tasks requiring them to make
ordinal judgments that were made purposefully difficult. The results
showed that in all four cases the judgments made by the group were
significantly related to the criterion, that ordinal rankings of the
judged items were made quickly and efficiently, and that in two of the
four tasks the numerical scaling of the ranked items had converged
to a stable position. The magnitude of the corrected correlation
coefficients showed that approximately 30 to 60 percent of the total
variance was accounted for in the correspondence between judgments
and the criterion. This is considered excellent in view of the many
factors that operated to attenuate the correlation coefficients. First,
as mentioned above, the criterion was based on old information.
Moreover, the criterion was based on a word count made entirely from
printed materials, whereas the task given the judges implied
familiarity with the words based on all contexts. Too, the
Thorndike-Lorge word count used all meanings of the words (e.g.,
ginger as a seasoning and a girl's name, sage as a seasoning and a
wise man), whereas their familiarity was judged in the specific
category specified. Additionally, the crucial assumption that made
this validation possible, that familiarity with verbal materials is
related to their frequency of occurrence in the language, is in itself
not a perfect relationship. Another factor that was no doubt a severe
constraint on the magnitude of the correlations was the way the words
were chosen for the lists. That is, there was a very minute difference
in the frequency count of some words within their selection bands. In
fact, two words in one of the middle bands and all four words in the
bottom band of the category I list were tied in frequency in the
Thorndike and Lorge general count. This was done to ensure that there
would be a large number of ties in the rankings of the judges in order
to make a thorough test of the Ford program. Considering the total
impact of these attenuating factors, the obtained correlation
coefficients are very high and provide strong evidence for the
efficiency of the Ford ranking procedure and the Pelz and Andrews
computer program as adapted for the Naval Postgraduate School's IBM



IV. TRIAL APPLICATION OF THE FORD PROGRAM 
A. PROBLEM SELECTION 

It has been shown that the Ford program is effective in taking 
ratings of judges with respect to an abstract, qualitative dimension 
and scaling them. The next and final step in this project is to deter- 
mine whether the procedures can be efficiently and effectively applied 
to a practical problem. If the former test can be considered a vali- 
dation of the program, the next step could be called a trial applica- 
tion of the program. 

It would be desirable to have the trial application duplicate
in detail a planned or proposed actual use of the Ford program. It
was emphasized in the previous report (Arima, 1971) that proper
project selection is a crucial component of successful laboratory
management. Dr. Donald F. Hornig, then director of the Office of
Science and Technology in the Executive Office of the President, was
quoted as saying that one of the most critical questions in the
effective utilization of Federal laboratories was "The choice of
problems, their significance, and the feasibility of finding solutions
through research and development . . ." (Subcommittee, 1968, p. 9).
One way to improve project selection might be to examine current
projects for their significance using representatives of sponsor and
using agencies, and to examine the feasibility of finding solutions
through research and development by having in-house
scientific/technical personnel evaluate current projects from this
standpoint. This line of reasoning led the trial application of the
Ford program to the problem of evaluating the significance of current
programs.

B. METHOD 

1. Stimulus Materials. 

As part of the review of in-house laboratories being conducted
by the Director of Defense Research and Engineering, the Director of
Navy Laboratories by letter dated 25 March 1971 requested various
activities within the Navy to document significant contributions and
accomplishments by their in-house laboratories. Using the material
prepared in response to this request by the Personnel Research
Division, Bureau of Naval Personnel, for the Navy's personnel research
laboratories, 10 programs were selected at random as items to be rated
for this trial application of the Ford program. The project
descriptions given in the report were edited and condensed, in some
cases, and appear in Appendix IV. A listing of the programs chosen is
shown below. The numbers and/or the short title (in parentheses) given
in the listing will hereafter be used to reference and identify the
programs. The programs were:

(1) Improved Enlisted Personnel Distribution and Management
(Personnel Distribution)

(2) Ship Manning Requirements Techniques (Manning Requirements)

(3) Evaluation of Standards for Navy Reenlistment (Reenlistment
Standards)

(4) Development of Navy Military Personnel Costing Techniques
for Use in Determining Cost Implications Associated with Changes in
Reenlistment Rates (Reenlistment Costing)

(5) Design of an Optimum Personnel Force Structure (Personnel
Structure)

(6) Interest Measurement in Officer Selection (Officer Selection)

(7) Evaluation Survey of the Effectiveness of Submarine Sonar
Operator Training (Sonar Training)

(8) Marginal Personnel/Minority Group Testing (Personnel Testing)

(9) Personnel Cost Research for Early Man/Machine Design
Trade-Offs (Man-Machine Costs)

(10) LOFARGRAM Analysis Procedures (LOFARGRAM Analysis)

2. Subjects 

The subjects were 10 Navy officer students attending the Naval
Postgraduate School.

3. Procedure 

The method was essentially identical to the validation procedures.
Each subject was given a copy of the research programs (Appendix IV) and
instructed to make an ordinal ranking of the items with respect to their
desirability and need for retention and further development as research
programs within the Navy. As before, they were told to rank only those
items which they could with confidence, to use as many ranks as they
deemed necessary, and to place as many programs as they desired in any
ranking category. They were advised to review the programs first and
then decide on the number of ranking categories to use. Having done
this, they wrote the number of the rank chosen beside the program
description. Cards were keypunched from these data and run through the
Ford program.



C. RESULTS 



The rankings given the 10 programs by the 10 judges are shown in
Table 5. The smallest number of programs ranked was four, by judge
number six. Another judge ranked eight items, and the other eight
judges ranked all programs. Of the latter, five judges used three
categories; one used four; another, five; and another, 10 categories.
The number of comparisons made by each judge is shown in Table 6; the
total was 312 comparisons.
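
The number of comparisons a judge contributes follows directly from the ranking: every pair of ranked items placed in different categories yields one comparison, and tied or unranked items yield none. A short sketch, with the hypothetical function name `comparisons_for_judge` and `None` used as an assumed marker for an unranked program; the three example rankings are the columns for judges 1, 6, and 10 as read from Table 5.

```python
from collections import Counter


def comparisons_for_judge(ranks):
    """Count item pairs that a judge placed in different rank categories.

    ranks holds one entry per program: the rank category chosen,
    or None when the judge did not rank the program.
    """
    counts = Counter(r for r in ranks if r is not None)
    ranked = sum(counts.values())
    all_pairs = ranked * (ranked - 1) // 2
    tied_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return all_pairs - tied_pairs


judge_1 = [2, 2, 1, 1, 2, 3, 1, 3, 2, 2]
judge_6 = [1, None, None, None, None, 2, 2, None, None, 1]
judge_10 = [1, 7, 5, 2, 4, 6, 3, 10, 9, 8]

for name, ranks in [("1", judge_1), ("6", judge_6), ("10", judge_10)]:
    print("judge", name, comparisons_for_judge(ranks))
```

A judge who ranks all 10 programs without ties reaches the maximum of 45 comparisons, while a judge using only three categories still contributes about 30.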

The win-loss matrix is shown in Table 7, with sums of wins (row
sums of $a_{ij}$) and losses (column sums of $a_{ij}$) in the right and
bottom margins, respectively. There were no universal highs or lows.
Only 14 iterations were required to achieve stable weights at the .005
criterion. The program used 7.55 seconds of central processing unit
time. Table 8 shows a summary of the results. The items are listed in
the order of their final ranks, and the table shows the number of
comparisons in which each item was involved (sums of wins and losses),
the win percent, and the final weights.
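
How the win-loss matrix and the summary columns follow from the judges' rankings can be sketched as below. This is an illustration under stated assumptions, not the Ford program: the rankings are those of Table 5, with `None` for a program a judge did not rank, and two cells that are illegible in the source (project 6, judge 8; project 10, judge 5) are read as rank 1. With that reading the sketch reproduces the per-judge counts of Table 6 and the Table 8 entries for project 1; other Table 8 entries are not guaranteed to match exactly.

```python
# Rows are projects 1-10, columns are judges 1-10 (Table 5); 1 is the best rank.
RANKS = [
    [2, 2, 1, 1, 1, 1, 2, 1, 2, 1],
    [2, 1, 1, 5, 2, None, 1, 2, 1, 7],
    [1, 1, 3, 7, 3, None, 1, 1, 1, 5],
    [1, 2, 2, 4, 3, None, 1, 2, 3, 2],
    [2, 1, 1, 6, 2, None, 3, 1, 4, 4],
    [3, 3, 1, 3, 2, 2, 1, 1, 5, 6],
    [1, 2, 3, None, 3, 2, 4, 3, 3, 3],
    [3, 1, 2, 8, 2, None, 4, 3, 2, 10],
    [2, 3, 2, 2, 3, None, 3, 2, 4, 9],
    [2, 2, 3, None, 1, 1, 4, 3, 4, 8],
]

n_items = len(RANKS)
n_judges = len(RANKS[0])

# a[i][k] counts the judges who ranked project i ahead of project k.
a = [[0] * n_items for _ in range(n_items)]
per_judge = []
for j in range(n_judges):
    count = 0
    for i in range(n_items):
        for k in range(n_items):
            ri, rk = RANKS[i][j], RANKS[k][j]
            if ri is not None and rk is not None and ri < rk:
                a[i][k] += 1
                count += 1
    per_judge.append(count)

wins = [sum(row) for row in a]
losses = [sum(a[i][k] for i in range(n_items)) for k in range(n_items)]
comparisons = [w + l for w, l in zip(wins, losses)]
win_percent = [100.0 * w / c for w, c in zip(wins, comparisons)]

print(per_judge, sum(per_judge))
print(comparisons[0], round(win_percent[0], 1))
```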
TABLE 5

RANKINGS OF TRIAL APPLICATION PROJECTS BY INDIVIDUAL JUDGES

                                             Judges
    Project                      1   2   3   4   5   6   7   8   9  10

 1. Personnel Distribution       2   2   1   1   1   1   2   1   2   1
 2. Manning Requirements         2   1   1   5   2   -   1   2   1   7
 3. Reenlistment Standards       1   1   3   7   3   -   1   1   1   5
 4. Reenlistment Costing         1   2   2   4   3   -   1   2   3   2
 5. Personnel Structure          2   1   1   6   2   -   3   1   4   4
 6. Officer Selection            3   3   1   3   2   2   1   1   5   6
 7. Sonar Training               1   2   3   -   3   2   4   3   3   3
 8. Personnel Testing            3   1   2   8   2   -   4   3   2  10
 9. Man-Machine Costs            2   3   2   2   3   -   3   2   4   9
10. LOFARGRAM Analysis           2   2   3   -   1   1   4   3   4   8

A dash indicates that the judge did not rank the project.
TABLE 6

NUMBER OF COMPARISONS MADE BY EACH JUDGE IN
THE TRIAL APPLICATION TEST

Judge Number        Number of Comparisons

      1                      31
      2                      32
      3                      33
      4                      28
      5                      32
      6                       4
      7                      35
      8                      33
      9                      39
     10                      45

    TOTAL                   312



TABLE 7

WIN-LOSS MATRIX FOR THE TRIAL APPLICATION TEST
TABLE 8

SUMMARY RESULTS OF THE TRIAL APPLICATION TEST

                                 Number of        Win        Final
    Project                     Comparisons     Percent     Weights

 1. Personnel Distribution          66            80.3       1.606
 2. Manning Requirements            60            68.3        .868
 3. Reenlistment Standards          62            64.4        .702
 4. Reenlistment Costing            63            60.2        .620
 5. Personnel Structure             60            60.0        .612
 6. Officer Selection               66            47.0        .338
 7. Sonar Training                  59            35.4        .217
 8. Personnel Testing               66            28.8        .175
 9. Man-Machine Costs               64            29.7        .174
10. LOFARGRAM Analysis              58            29.3        .171
D. DISCUSSION AND SUMMARY

While the number of judges and the number of alternatives
evaluated were small, consistent trends were evident. Except for one
judge who contributed only four comparisons, all the other judges
contributed from 28 to 45 comparisons, showing that any judge makes a
significant contribution to the total number of judgments, even if he
does not rank all items and uses few rank categories. Similarly, in
spite of the freedom permitted the judges in choosing items to rate
and the number of rating categories, the entries in Table 8 show that
all items entered into a fairly uniform number of comparisons, with a
range from 58 to 66. Obviously, both of these distributions will
depend on the sample of judges and the types and number of
alternatives to be judged, but it is apparent from this trial that
there will be a central tendency in the number of categories judges
will choose to use and the number of alternatives a judge will
adjudicate. Similarly, the alternatives will tend to attract a fairly
uniform number of comparisons over a number of judges. Moreover, when
the choices are difficult, there will probably not be any universal
highs or lows, thanks to those who bet the long shots and the others
who will give the lowest underdog a boost. The most important finding,
however, was that the weights stabilized rapidly, indicating that a
group of judges can achieve reasonable consensus in their composite
judgment. Finally, the efficiency of the system was revealed by the
very short computer time required for the scaling.

Five of the rated programs could be identified in the work plans
of the two laboratories with some degree of certitude. From these
descriptions, the five were ranked according to FY 1971 expenditures
for each program, and a Pearson rank correlation coefficient was
calculated with the ranks of the programs based on their weights
obtained from the 10 judges. The obtained rho was .60, which suggests
that there is a relationship between the amounts being invested in
these research projects and the combined judgments of Naval officers
who are representative of user elements of the Navy. This trend lends
credence to the suggestion presented above, that the Ford program
might well be used to analyze project selection based on the
relationship between funding and user ratings, professional estimates
of feasibility of finding solutions through research, and the
resources actually being programmed for the projects.

V. ADDITIONAL APPLICATIONS 



A. SITUATION 

Concurrent with this study, an investigation was being made into
the relative values of the major segments of the Naval Postgraduate
School's operations research courses as seen by the student. One group
of 54 graduating students in the operations analysis curriculum and
another group of 15 graduating students in various management curricula
had been asked to rank nine program segments in the operations research
list of courses. The data lay unanalyzed because of the many ties (which
were permitted) and because students had ranked different numbers of the
program segments. (They could not rank courses they had not taken.)

B. RESULTS

The data were in a form that would be obtained in an application 
of the Ford procedure. Accordingly, they were run through the Ford pro- 
gram with a convergence criterion of .005. The criterion was reached 
in 14 iterations for the 54 operations analysis students and in 24 iter- 
ations for the management students. The result was a useful scaling of 
the items for the purposes that had motivated their collection. 

C. COMMENTS 

This application in a genuine research setting shows the utility
of the Ford program. It confirms statements made above in the
discussion of the trial application test that a consensus, in the form
of weight convergence, is rapidly reached when knowledgeable judges
rate clearly defined, real-world alternatives. One must conclude that
the Ford program could be used to good advantage in the many,
ever-increasing, difficult decision situations which are currently
arising in which value judgments made by individuals are the major
sources of data. It should be noted, too, that the data had been
collected in a manner that was identical to an application of the Ford
procedure. In this case, however, circumstances dictated that they be
collected in this fashion. That is, the investigators felt that, to
get a valid sampling of opinions, the individual judge had to be
permitted to use the number of rating categories he desired
(effectively accomplished by permitting multiple ties) and to refrain
from adjudicating those items with which he was not familiar. That
these elements should be characteristic of a good scheme for
collecting qualitative judgments was mentioned in the introductory
portions of this study.

VI. SUMMARY AND CONCLUSIONS 

The purpose of this study was to develop, validate, and test
the feasibility of a procedure for obtaining qualitative judgments
from individuals to be used in evaluating the effectiveness and
operations of the Navy's in-house personnel research laboratories. The
Ford procedure for scaling partially ordered rankings, as programmed
by Pelz and Andrews, was further programmed for the Naval Postgraduate
School's IBM 360/67 system. The procedures and program were validated
using an arbitrary, abstract task for which there was an extrinsic
criterion and tested for feasibility in research evaluation using
descriptions of actual program projects. In both cases, the results
were highly satisfactory.

It can be concluded that the Ford procedure and present program
can be used to obtain qualitative judgments with accuracy and efficiency.
The utility of the program is limited only by the imagination and
creativity of the user in devising appropriate rating schemes for his
purpose. It should be a very useful tool for the many researchers who
today are faced with analyzing "quality of life" variables for which
conventional measurements do not exist.

APPENDIX I

FLOW CHART OF THE FORD PROGRAM

(Flow chart boxes and their annotations, in order of appearance.
Portions of the chart are missing.)

START
DIMENSION & COMMON

READ AND WRITE INFO FROM LABEL CARD

REWIND 9 (DISK)
    Note: Zero disk for maximum usage.

READ N, JJ, EPSLON, JUPPER
    Note: Read parameter values.
          N = # of objects being judged
          JJ = # of judges
          EPSLON = convergence criterion
          JUPPER = max # of iterations w/out convergence

ZERO NR(I) AND NC(I)

NCOUNT = 0
    Note: Sets # of comparisons counter to zero.

J1 = 1,JJ

NCNT1 = NCOUNT

READ NG
    Note: # of ranking categories for judge being considered.

ZERO ARRAY A(I,J)

NCNT1 = 1,NCOUNT

READ M1, M2 (DISK)

A(M1,M2) = A(M1,M2) + 1
    Note: Tabulates for each individual comparison the number of
          times that comparison is made by all judges in experiment.
          Since it was done sequentially from ranking order, the
          tabulation is the number of times i was preferred to j,
          i.e., the win-loss matrix.

ITER = 0

I = 1,N

MAN(I) = I

GO TO 100

INEW = 0

WRITE WIN-LOSS MATRIX

(Decision)
    Note: Logic switch to determine if all objects have been rated
          both high and low. If yes, i.e., NR*NC ≠ 0, goes to compute
          weighting factors; if no, i.e., NR*NC = 0, must determine
          which are high and which low to eliminate these.

INEW = INEW + 1

JNEW = 1

MAN(INEW) = MAN(I)
    Note: Removes universal highs and universal lows from win-loss
          matrix. This reduced matrix is used to compute weighting
          factors.

(Decision)
    Note: Determines universal highs and lows.

WRITE MAN(I), ITER (UNIVERSAL HIGH)

GO TO 10
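
The "logic switch" step of the chart, which removes universal highs (items never beaten) and universal lows (items that never won) before weights are computed, can be sketched as follows. This is a hypothetical illustration rather than the program's code: the function name, the repeated pass after each removal, and the labels are assumptions, with the win-loss convention that `a[i][j]` counts preferences of item i over item j.

```python
def remove_universals(a, ids):
    """Strip items with zero wins or zero losses from a win-loss matrix.

    Returns the reduced matrix, the surviving original IDs, and a list
    of (id, label) pairs for the removed items. The check is repeated
    because removing one item can leave another with no wins or losses.
    """
    a = [row[:] for row in a]
    ids = list(ids)
    removed = []
    while True:
        n = len(a)
        wins = [sum(a[i]) for i in range(n)]
        losses = [sum(a[j][i] for j in range(n)) for i in range(n)]
        drop = [i for i in range(n) if wins[i] == 0 or losses[i] == 0]
        if not drop:
            return a, ids, removed
        for i in drop:
            if wins[i] == 0 and losses[i] == 0:
                label = "unranked"
            elif losses[i] == 0:
                label = "high"
            else:
                label = "low"
            removed.append((ids[i], label))
        keep = [i for i in range(n) if i not in drop]
        a = [[a[i][j] for j in keep] for i in keep]
        ids = [ids[i] for i in keep]


# Hypothetical matrix: item 1 beats everything and is never beaten.
toy_matrix = [[0, 2, 3], [0, 0, 1], [0, 1, 0]]
reduced, kept_ids, removed_items = remove_universals(toy_matrix, [1, 2, 3])
print(reduced, kept_ids, removed_items)
```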

APPENDIX II

DATA ASSEMBLY FOR INPUT TO THE FORD PROGRAM

DATA CARD SET UP

A. LABEL CARD - Type "1" in col. one, then any 71H. (This will be
   printed out by machine.)

B. PARAMETER CARD - All numbers right adjusted. Omit all leading
   zeros.

   Col. 1-6   - Total # of objects being compared by all judges ≤ 130
   Col. 7-12  - # of judges ≤ 130
   Col. 13-18 - Convergence criterion (.005 presently used)
   Col. 19-24 - Max # of iterations

C. JUDGE CARD - Right adjusted. Omit leading zeros.

   Col. 1-6 - # of ranks used by judge ≤ 130

D. DATA CARD - Right adjusted. Use leading zeros.

   Col. 1-3 - # of objects placed in this rank by judge.
   Col. 4-6 - ID # of object (original ID #)
        7-9 -
        ...
        70-72

   Continue with as many cards as necessary to fill out rank.
   Subsequent cards begin ID # Col. 1-3.

Repeat C and D for each judge.
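
The fixed-column layout of the parameter card can be illustrated with a small reader. This is a sketch under the column assignments listed above; the function name, the dictionary keys, and the sample card are hypothetical, and the limit of 130 is applied to both objects and judges on the assumption that it governs both fields.

```python
def parse_parameter_card(card):
    """Read the four right-adjusted, six-column fields of a parameter card."""
    card = card.ljust(24)
    params = {
        "n_objects": int(card[0:6]),     # col. 1-6
        "n_judges": int(card[6:12]),     # col. 7-12
        "epsilon": float(card[12:18]),   # col. 13-18
        "max_iter": int(card[18:24]),    # col. 19-24
    }
    if params["n_objects"] > 130 or params["n_judges"] > 130:
        raise ValueError("objects and judges are limited to 130")
    return params


# Build a sample card: each field right adjusted in six columns.
sample_card = f"{10:>6}{10:>6}{'.005':>6}{50:>6}"
print(repr(sample_card))
print(parse_parameter_card(sample_card))
```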

CARD ASSEMBLY

APPENDIX III

THE FORD PROGRAM

A SEQUENTIAL ID NUMBER IS ASSIGNED TO THE ORIGINAL
ID NUMBER OF THE OBJECTS BEING JUDGED IN ORDER OF
THEIR APPEARANCE IN THE DATA, UNTIL ALL OBJECTS
BEING JUDGED ARE ACCOUNTED FOR. NO DUPLICATION OF
ASSIGNMENTS IS MADE.
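
The ID reassignment described in this comment can be sketched as a first-appearance mapping. The function name and the sample IDs are hypothetical; the sketch only illustrates the stated rule and is not a transcription of the program.

```python
def assign_sequential_ids(original_ids):
    """Map each original object ID to 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    for oid in original_ids:
        if oid not in mapping:          # no duplicate assignments
            mapping[oid] = len(mapping) + 1
    return mapping


# Hypothetical stream of original IDs as they appear on the data cards.
print(assign_sequential_ids([7, 3, 7, 12, 3]))
```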

APPENDIX IV

PROJECT DESCRIPTIONS FOR TRIAL APPLICATION TESTING

1. TITLE: Improved Enlisted Personnel Distribution and Management.

DESCRIPTION: A computer assisted distribution and assignment
(CADA) system is being designed to help improve the
utilization of enlisted manpower. Preliminary model
currently is being implemented in the Pacific Fleet.
Prototype model is now under development for application in
BUPERS in support of centralized management of enlisted
ratings. Related research results include development of
computer and mathematically based procedures for (1) the
equitable allocation of personnel resources, (2) the
optimal match of man and billet, (3) the identification of
billet vacancies in order of priority, (4) the projection
of the number of distributable assets, and (5) the
feedback of information on the results of distribution
management actions.

2. TITLE: Ship Manning Requirements Techniques.

DESCRIPTION: The increasing sophistication and complexity
of naval ships, systems, and equipments in the
face of Project Volunteer and a smaller Navy requires
the development of methods which will improve the accuracy
of manpower requirements forecasting and manpower
utilization.

A technique for defining and documenting manpower
requirements for ships based on the application of
selected work study techniques to basic manning criteria in
each of the separate work areas aboard ship has been
developed. It permits the production of a document which
displays in detail the rationale for manning by ship
classes based on equipment and required operational
capabilities to meet mission assignment.

3. TITLE: Evaluation of Standards for Navy Reenlistment.

DESCRIPTION: This research was generated out of concern
over the quality of reenlistees. Unsatisfactory performance
was costing the military services enormous amounts
of money in such things as reenlistment bonuses and pay
and allowances for reenlistees from whom commensurate
service was not realized. Court and confinement costs of
reenlistees were cited. It was suspected that personnel
of inferior quality were being allowed to reenlist,
including some with unsatisfactory first term records.

In an attempt to identify unsatisfactory individuals
prior to reenlistment, comparisons were made between
unsatisfactory and satisfactory reenlistees on information
available at the time of the reenlistment decision. The
project also provided information on the effect on manning
which would result if reenlistment standards were made
more stringent.

4. TITLE: Development of Navy Military Personnel Costing
Techniques for Use in Determining Cost Implications
Associated with Changes in Reenlistment Rates.

DESCRIPTION: Thousands of skilled technicians are required
to operate and maintain the complex systems and
equipment now in the Fleet. The Navy constantly
experiences difficulty in retaining these technicians because
of competition for them from other sectors of the
economy.

To alleviate this problem, several technician-oriented
procurement programs and career incentive programs are
employed. To facilitate evaluation of these programs, a
methodology for determining the relative cost benefits
associated with retention of personnel has been developed.

5. TITLE: Design of an Optimum Personnel Force Structure.

DESCRIPTION: An optimum force structure containing
appropriately qualified personnel in sufficient numbers at
least cost cannot now be certified. This project is
concerned with the development of improved techniques to
analyze and balance the relationship between personnel
requirements and the composition of the existing force
structure.

6. TITLE: Interest Measurement in Officer Selection.

DESCRIPTION: Each year several thousand young men apply
for officer training programs at the Naval Academy and
NROTC units at various colleges. High attrition rates
are experienced in both training and active duty. To
reduce the cost of losing substantial proportions of
these men, it is imperative that those applicants having
the greatest career potential be identified in the
selection process. Several years of research on vocational
interest tests and biographical questionnaires have made
it possible to identify those applicants most likely to
successfully complete officer training and remain in the
Navy after completing their minimum requirements.

7. TITLE: Evaluation Survey of the Effectiveness of Submarine
Sonar Operator Training.

DESCRIPTION: A comprehensive survey was accomplished of
the proficiency, training, and utilization of submarine
sonar technicians and sonar watchstanders. The survey
provided up-to-date information concerning the efficiency
of training procedures. Such information is necessary on
a periodic basis to insure appropriate alignment of the
training to fleet requirements in order to prevent
serious impairment of operational fleet submarine ASW
efficiency. Data gathering instruments included interview
forms, self ratings, supervisor ratings, knowledge tests,
and performance tests.

8. TITLE: Marginal Personnel/Minority Group Testing.

DESCRIPTION: Present test batteries used in both military
and civilian settings have been criticized for alleged
inequities when used with groups defined on the basis of
race or ethnic affiliation. Public policy as well as
efficient manpower utilization requires that all personnel
be afforded equality of opportunity in assignment and that
those abilities being measured bear relevance to skills
required on-the-job.

9. TITLE: Personnel Cost Research for Early Man/Machine
Design Trade-Offs.

DESCRIPTION: The critical element of personnel cost has
not been systematically considered when making system
design and development decisions early in the system
development cycle. No tools exist to enable the
cost-effectiveness of such decisions to be measured. For this
reason, research was undertaken to develop a personnel
cost model for use in personnel and man-equipment trade-off
decisions. A basic model was accomplished which allowed
the identification of all pertinent cost items and the
accumulation of cost elements in an unequivocal manner.

10. TITLE: LOFARGRAM Analysis Procedures.

DESCRIPTION: The airborne JEZEBEL system has shown great
potential as a means of detecting and classifying
underwater contacts; however, its usefulness has been
continually hampered by the lack of adequately trained operators.
One of the main reasons for operator deficiencies is that
training programs have been seriously hampered by the lack
of a standardized, systematic procedure for analyzing the
information displayed on the gram, which is the main display
component of the system.

In order to correct this situation, a systematic
LOFARGRAM procedure was developed.